Spectral conversion based on statistical models including time-sequence matching

Authors

  • Yoshihiko Nankaku
  • Kenichi Nakamura
  • Tomoki Toda
  • Keiichi Tokuda
Abstract

This paper proposes a spectral conversion technique based on a new statistical model that includes time-sequence matching. In conventional GMM-based approaches, dynamic programming (DP) matching between source and target feature sequences is performed prior to the training of GMMs. Although a frame-level similarity measure such as the Euclidean distance is typically adopted for this matching, it might be inappropriate for converting spectral features. The likelihood function of the proposed model can directly deal with two sequences of different lengths, with the frame alignment between source and target feature sequences represented by discrete hidden variables. In the proposed algorithm, the maximum likelihood criterion is consistently applied to the training of model parameters, sequence matching, and spectral conversion. In a subjective preference test, the proposed method is superior to the conventional GMM-based method.
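
The conventional pre-alignment step mentioned in the abstract can be illustrated with a minimal sketch, assuming plain dynamic time warping with a Euclidean frame distance and three allowed steps (diagonal, vertical, horizontal). The function name `dtw_align` and the step pattern are assumptions made for illustration; the abstract does not specify them, and this is not the proposed model, which instead treats the alignment as discrete hidden variables under a maximum likelihood criterion.

```python
import math


def dtw_align(source, target):
    """Align two feature sequences by DP matching (hypothetical helper).

    source, target: non-empty lists of equal-dimension feature vectors.
    Returns (path, total_cost), where path is a list of (i, j) frame pairs.
    """
    n, m = len(source), len(target)
    # cost[i][j]: minimal accumulated distance of a path ending at
    # source frame i-1 and target frame j-1.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean distance as the frame similarity measure (assumed).
            d = math.dist(source[i - 1], target[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # advance both sequences
                cost[i - 1][j],      # advance source only
                cost[i][j - 1],      # advance target only
            )

    # Backtrack from the last frame pair to recover the alignment.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        # Ties are resolved in favor of the diagonal step.
        _, i, j = min(candidates, key=lambda c: c[0])
    path.reverse()
    return path, cost[n][m]
```

In a conventional pipeline, the aligned frame pairs returned here would be used to build joint source-target vectors for GMM training; the choice of distance at this stage is exactly what the abstract questions.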


Similar resources

Simultaneous conversion of duration and spectrum based on statistical models including time-sequence matching

This paper describes a technique for the simultaneous conversion of duration and spectrum based on a statistical model including time-sequence matching. Conventional GMM-based approaches cannot perform spectral conversion that takes speaking rate into account because they assume one-to-one frame matching between source and target features. However, speaker characteristics may appear in speaking rates. In orde...

Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems

This article aims to examine methods of optimizing the performance of GMM-based voice conversion systems, in which the GMM method is introduced as the basic method for improving voice conversion system performance. In the current methods, due to the use of a single conversion function to convert all speech units and the subsequent spectral smoothing arising from statistical averaging, we will observe quality ...

Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation

The performance of voice conversion has been considerably improved through statistical modeling of spectral sequences. However, the converted speech still contains traces of artificial sounds. To alleviate this, it is necessary to statistically model a source sequence as well as a spectral sequence. In this paper, we introduce STRAIGHT mixed excitation to a framework of the voice conversion bas...

Fast Unsupervised Automobile Insurance Fraud Detection Based on Spectral Ranking of Anomalies

Collecting insurance fraud samples is costly and, if performed manually, very time-consuming. This issue suggests the use of unsupervised models. One of the accurate methods in this regard is Spectral Ranking of Anomalies (SRA), which has been shown to work better than other methods specifically for auto insurance fraud detection. However, this approach is not scalable to large samples and is not appro...

Dynamic model selection for spectral voice conversion

Statistical methods for voice conversion are usually based on a single model selected in order to represent a tradeoff between goodness of fit and complexity. In this paper we assume that the best model may change over time, depending on the source acoustic features. We present a new method for spectral voice conversion called Dynamic Model Selection (DMS), in which a set of potential best mode...

Publication date: 2007